The impact of Gene Ontology evolution on GO-Term Information Content

Authors

  • Pietro Hiram Guzzi
  • Giuseppe Agapito
  • Marianna Milano
  • Mario Cannataro
Abstract

The Gene Ontology (GO) is a major bioinformatics ontology that provides structured controlled vocabularies to classify the function and role of genes and proteins. GO and its annotations of gene products are now an integral part of functional analysis. Recently, evaluating the similarity among gene products on the basis of their annotations (also referred to as semantic similarity) has become a growing area of bioinformatics. While much research has addressed updates to the structure of GO and to the annotation corpora, the impact of GO evolution on semantic similarity measures has remained largely unexamined. Here we extensively analyze how GO changes, and argue that these changes should be carefully considered by all users of semantic similarity measures. In particular, GO changes have a large impact on the information content (IC) of GO terms. Since many semantic similarity measures rely on the calculation of IC, these changes deserve thorough investigation. We consider GO versions from 2005 to 2014 and calculate the IC of all GO terms using five different formulations, then compare the results. The analysis confirms that there is a statistically significant difference among different IC formulations applied to the same version of the ontology (which is expected), and that there is also a statistically significant difference among the results obtained with different GO versions using the same IC formula. The results show a remarkable bias due to GO evolution that has not been considered so far. Future work should take this into account.
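
The abstract does not name the five IC formulations it compares. As a rough illustration of why IC depends on the ontology version, the sketch below assumes one widely used annotation-based definition, IC(t) = -log p(t), where p(t) is the fraction of gene products annotated to t or to any of its descendants. The term names, gene names, and both toy "versions" are hypothetical and are not data from the paper.

```python
import math

def ancestors(term, parents):
    """Return the term itself plus all of its ancestors.

    `parents` maps each child term to a list of its parent terms (a DAG).
    """
    seen = {term}
    stack = [term]
    while stack:
        for p in parents.get(stack.pop(), []):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def information_content(parents, annotations):
    """Annotation-based IC (assumed formulation): -log of the fraction of
    gene products annotated to a term or to any of its descendants."""
    genes_per_term = {}
    for gene, terms in annotations.items():
        for t in terms:
            # Propagate each direct annotation up to every ancestor.
            for a in ancestors(t, parents):
                genes_per_term.setdefault(a, set()).add(gene)
    total = len(annotations)
    return {t: -math.log(len(g) / total) for t, g in genes_per_term.items()}

# Hypothetical older version: two terms directly under the root.
old_parents = {"B": ["root"], "C": ["root"]}
old_annotations = {"g1": ["B"], "g2": ["B"], "g3": ["C"], "g4": ["C"]}

# Hypothetical newer version: term D added under B, plus new annotations.
new_parents = {"B": ["root"], "C": ["root"], "D": ["B"]}
new_annotations = {
    "g1": ["B"], "g2": ["D"], "g3": ["C"],
    "g4": ["C"], "g5": ["D"], "g6": ["D"],
}

ic_old = information_content(old_parents, old_annotations)
ic_new = information_content(new_parents, new_annotations)
print(f"IC(B) old: {ic_old['B']:.3f}  new: {ic_new['B']:.3f}")
```

In this toy case the IC of term B drops between versions even though B itself was never edited, because both the structure beneath it and the annotation corpus changed. Any similarity measure built on IC would inherit that shift.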

Similar resources

Information Content-Based Gene Ontology Semantic Similarity Approaches: Toward a Unified Framework Theory

Several approaches have been proposed for computing term information content (IC) and semantic similarity scores within the gene ontology (GO) directed acyclic graph (DAG). These approaches contributed to improving protein analyses at the functional level. Considering the recent proliferation of these approaches, a unified theory in a well-defined mathematical framework is necessary in order to...

Using the Protein-protein Interaction Network to Identifying the Biomarkers in Evolution of the Oocyte

Background Oocyte maturity includes nuclear and cytoplasmic maturity, both of which are important for embryo fertilization. The development of oocyte is not limited to the period of follicular growth, and starts from the embryonic period and continues throughout life. In this study, for the purpose of evaluating the effect of the FSH hormone on the expression of genes, GEO access codes for this...

Impact of ontology evolution on functional analyses

MOTIVATION Ontologies are used in the annotation and analysis of biological data. As knowledge accumulates, ontologies and annotation undergo constant modifications to reflect this new knowledge. These modifications may influence the results of statistical applications such as functional enrichment analyses that describe experimental data in terms of ontological groupings. Here, we investigate ...

Semantic Similarity Definition over Gene Ontology by Further Mining of the Information Content

The similarity of two gene products can be used to solve many problems in information biology. Since one gene product corresponds to several GO (Gene Ontology) terms, one way to calculate the gene product similarity is to use the similarity of their GO terms. This GO term similarity can be defined as the semantic similarity on the GO graph. There are many kinds of similarity definitions of two ...

Bi-directional semantic similarity for gene ontology to optimize biological and clinical analyses

BACKGROUND Semantic similarity analysis facilitates automated semantic explanations of biological and clinical data annotated by biomedical ontologies. Gene ontology (GO) has become one of the most important biomedical ontologies with a set of controlled vocabularies, providing rich semantic annotations for genes and molecular phenotypes for diseases. Current methods for measuring GO semantic s...

Publication date: 2016